Characterizing the Effects of Intermittent Faults on a Processor for Dependability Enhancement Strategy

Authors

  • Chao (Saul) Wang
  • Zhong-Chuan Fu
  • Hong-Song Chen
  • Dong-Sheng Wang
Abstract

As semiconductor technology scales into the nanometer regime, intermittent faults have become an increasing threat. This paper examines the effects of intermittent faults on NET versus REG and the implications of those effects for dependability strategy. First, the vulnerability characteristics of representative units in OpenSPARC T2 are revealed, and the highly sensitive modules in particular are identified. Second, an architecture-level dependability enhancement strategy is proposed, showing that events such as core/strand running status and core-memory interface events are candidates for detectable symptoms. A simple watchdog can be deployed to detect application running status (the IEXE event). The SDC (silent data corruption) rate is then evaluated, demonstrating the potential of this strategy. Third, the effects of traditional protection schemes in the target CMT on intermittent faults are quantitatively studied in terms of the contribution of each trap type, demonstrating the necessity of taking this factor into account in the strategy.

Similar resources

DOCTOR: An Integrated Software Fault Injection Environment

This paper presents an integrateD sOftware fault injeCTiOn enviRonment (DOCTOR) which is capable of injecting various types of faults with different options, automatically collecting performance and dependability data, and generating synthetic workloads under which system dependability is evaluated. A comprehensive graphical user interface is also provided. A special emphasis is given to th...

Software Dependability in the Tandem GUARDIAN System

Abstract: Based on extensive field failure data for Tandem's GUARDIAN operating system, this paper discusses evaluation of the dependability of operational software. Software faults considered are major defects that result in processor failures and invoke backup processes to take over. The paper categorizes the underlying causes of software failures and evaluates the effectiveness of the process ...

A new strategy for controlling wind turbines against sensor faults and wake effects to harvest more electrical energy

This paper describes a new method for harvesting maximum electrical energy in wind farms. In the proposed technique, stochastic process principles are applied to detect faulty sensor measurements, and the wind farm is modeled using fuzzy concepts. The turbines are thereby controlled against continuous changes in the speed, direction, and eddy currents of the blowing wind. To...

Design principles for processor maintainability

With the arrival of large real-time, time-shared systems, the requirement of system reliability has become even more demanding. The result of even a momentary system misbehavior could be catastrophic, since any disruption of service is experienced by all the users on-line at that time. Thus for real-time systems such as telephone switching systems, airline reservation systems, on-line teaching ...

Hardware and Software Transparency in the Protection of Programs Against SEUs and SETs

Processor cores embedded in systems-on-a-chip (SoCs) are often deployed in critical computations, and when affected by faults they may produce dramatic effects. When hardware hardening is not cost-effective, software implemented hardware fault tolerance (SIHFT) can be a solution to increase SoCs’ dependability, but it increases the time for running the hardened application, as well as the memor...

Journal title:

Volume 2014, Issue

Pages -

Publication date: 2014